Data-adaptive test statistics for microarray data

Authors

  • Sach Mukherjee
  • Stephen J. Roberts
  • Mark J. van der Laan
Abstract

MOTIVATION: An important task in microarray data analysis is the selection of genes that are differentially expressed between different tissue samples, such as healthy and diseased. However, microarray data contain an enormous number of dimensions (genes) and very few samples (arrays), a mismatch which poses fundamental statistical problems for the selection process that have defied easy resolution.

RESULTS: In this paper, we present a novel approach to the selection of differentially expressed genes in which test statistics are learned from data using a simple notion of reproducibility in selection results as the learning criterion. Reproducibility, as we define it, can be computed without any knowledge of the 'ground-truth', but takes advantage of certain properties of microarray data to provide an asymptotically valid guide to expected loss under the true data-generating distribution. We are therefore able to indirectly minimize expected loss, and obtain results substantially more robust than conventional methods. We apply our method to simulated and oligonucleotide array data.

AVAILABILITY: By request to the corresponding author.
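
The abstract does not spell out how reproducibility is computed or which family of test statistics is searched, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes a hypothetical one-parameter family of statistics (a two-sample t-statistic with an offset `a` added to the standard error) and a hypothetical reproducibility score (the mean overlap of the top-`k` gene lists selected on disjoint random halves of the arrays). All function names, the choice of family, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def moderated_t(x, y, a):
    """Per-gene two-sample statistic with offset `a` added to the standard error.

    x, y: arrays of shape (genes, arrays). a = 0 gives the Welch t-statistic.
    The family itself is an assumption, not taken from the abstract.
    """
    diff = x.mean(axis=1) - y.mean(axis=1)
    se = np.sqrt(x.var(axis=1, ddof=1) / x.shape[1]
                 + y.var(axis=1, ddof=1) / y.shape[1])
    return diff / (se + a)

def top_k(scores, k):
    """Indices of the k genes with the largest absolute score."""
    return set(np.argsort(-np.abs(scores))[:k].tolist())

def reproducibility(x, y, a, k, n_splits, rng):
    """Mean overlap fraction of top-k sets chosen on disjoint halves of the arrays."""
    hx, hy = x.shape[1] // 2, y.shape[1] // 2
    overlaps = []
    for _ in range(n_splits):
        px = rng.permutation(x.shape[1])
        py = rng.permutation(y.shape[1])
        s1 = top_k(moderated_t(x[:, px[:hx]], y[:, py[:hy]], a), k)
        s2 = top_k(moderated_t(x[:, px[hx:]], y[:, py[hy:]], a), k)
        overlaps.append(len(s1 & s2) / k)
    return float(np.mean(overlaps))

def select_offset(x, y, candidates, k=20, n_splits=50, seed=0):
    """Return the candidate offset with the most reproducible selections."""
    scores = []
    for a in candidates:
        rng = np.random.default_rng(seed)  # reuse the same splits for every candidate
        scores.append(reproducibility(x, y, a, k, n_splits, rng))
    return candidates[int(np.argmax(scores))], scores

# Simulated example: 200 genes, 8 arrays per group, first 20 genes shifted.
sim = np.random.default_rng(1)
healthy = sim.normal(size=(200, 8))
diseased = sim.normal(size=(200, 8))
diseased[:20] += 2.0
best_a, scores = select_offset(healthy, diseased, candidates=[0.0, 0.1, 0.5, 1.0])
print(best_a, scores)
```

No ground-truth labels for the genes are used anywhere in `select_offset`; the score depends only on agreement between selections made on independent halves of the data, which is the property the abstract emphasizes. Each half must contain at least two arrays per group so that the sample variance is defined.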

Similar resources

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to the production of large amounts of data, and in the field of biology, microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect differentially expressed genes. In this study, the false discovery rate was investiga...

Limma: Linear Models for Microarray Data

A survey is given of differential expression analyses using the linear modeling features of the limma package. The chapter starts with the simplest replicated designs and progresses through experiments with two or more groups, direct designs, factorial designs and time course experiments. Experiments with technical as well as biological replication are considered. Empirical Bayes test statistic...

A Data-Adaptive Approach to cDNA Microarray Image Enhancement

A data-adaptive approach for cDNA microarray image enhancement is presented. Through the weighting coefficients adaptively determined from local microarray image statistics, the proposed technique tunes the overall filter’s detail-preserving and noise-attenuating characteristics and uses both the spatial and spectral correlation of the cDNA image during processing. Noise removal is performed by...

A permutation test motivated by microarray data analysis

We introduce a nonparametric test intended for large-scale simultaneous inference in situations where the utility of distribution-free tests is limited because of their discrete nature. Such situations are frequently dealt with in microarray analysis where the number of tests is much larger than the sample size. The proposed test statistic is based on a certain distance between the distribution...

Bayesian Quantile Regression with Adaptive Elastic Net Penalty for Longitudinal Data

Longitudinal studies are an important part of epidemiological surveys, clinical trials and social studies. In longitudinal studies, responses are measured repeatedly over time. Often, the main goal is to characterize the change in responses over time and the factors that influence the change. Recently, to analyze this kind of data, quantile regression has been taken ...

Journal:
  • Bioinformatics

Volume 21 Suppl 2

Publication date: 2005